Approximating Optimal Dudo Play with Fixed-Strategy Iteration Counterfactual Regret Minimization

Authors

Todd W. Neller

Steven Hnath

Abstract

Using the bluffing dice game Dudo as a challenge domain, we abstract information sets using imperfect recall of actions. Even with such abstraction, the standard Counterfactual Regret Minimization (CFR) algorithm proves impractical for Dudo, with the number of recursive visits to the same abstracted information sets increasing exponentially with the depth of the game graph. By holding strategies fixed across each training iteration, we show how CFR training iterations may be transformed from an exponential-time recursive algorithm into a polynomial-time dynamic-programming algorithm, making computation of an approximate Nash equilibrium for the full 2-player game of Dudo possible for the first time.
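
As a rough illustration of what "imperfect recall of actions" can mean, the sketch below builds an abstracted information-set key from a player's own dice and only the most recent claims. It is an assumption-laden sketch, not the authors' implementation: the function name `abstract_info_set`, the `(quantity, rank)` claim encoding, and the `recall` limit are all hypothetical.

```python
from typing import List, Tuple

Claim = Tuple[int, int]  # hypothetical encoding: (quantity, rank)


def abstract_info_set(own_roll: List[int],
                      claim_history: List[Claim],
                      recall: int = 3) -> Tuple[Tuple[int, ...], Tuple[Claim, ...]]:
    """Return a hashable key that forgets all but the last `recall` claims.

    Histories that differ only in older claims collapse to the same key,
    which is what makes the abstraction one of imperfect recall.
    """
    # Guard recall == 0 explicitly: claim_history[-0:] would return everything.
    recent = tuple(claim_history[-recall:]) if recall > 0 else ()
    # Dice order carries no information, so sort the roll.
    return (tuple(sorted(own_roll)), recent)


# Two histories that differ only in their earliest claim share one key.
key_a = abstract_info_set([3, 1], [(1, 2), (1, 3), (2, 2), (2, 5)], recall=2)
key_b = abstract_info_set([1, 3], [(1, 4), (1, 3), (2, 2), (2, 5)], recall=2)
```

Because many distinct claim sequences map to the same key, the same abstracted information set can be reached along many paths, which is the situation the abstract describes as causing exponentially many recursive visits.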

Similar resources

An Introduction to Counterfactual Regret Minimization

In 2000, Hart and Mas-Colell introduced the important game-theoretic algorithm of regret matching. Players reach equilibrium play by tracking regrets for past plays, making future plays proportional to positive regrets. The technique is not only simple and intuitive; it has sparked a revolution in computer game play of some of the most difficult bluffing games, including clear domination of ann...
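
The rule described above, playing each action in proportion to its positive regret, can be sketched in a few lines. This is a minimal illustration rather than code from the paper: the function name `regret_matching` is hypothetical, and falling back to a uniform strategy when no regret is positive is a common convention assumed here.

```python
from typing import List


def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Turn cumulative regrets into a mixed strategy over actions."""
    # Only positive regrets contribute; negative ones are clipped to zero.
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # Assumed convention: play uniformly when no action has positive regret.
    n = len(cumulative_regrets)
    return [1.0 / n] * n


strategy = regret_matching([1.0, -2.0, 3.0])
```

With regrets of 1, -2, and 3, the second action is never played and the remaining probability is split 1:3 between the first and third actions.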

Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization

Recently, there has been considerable progress towards algorithms for approximating Nash equilibrium strategies in extensive games. One such algorithm, Counterfactual Regret Minimization (CFR), has proven to be effective in two-player zero-sum poker domains. While the basic algorithm is iterative and performs a full game traversal on each iteration, sampling based approaches are possible. For i...

Strategy-Based Warm Starting for Regret Minimization in Games

Counterfactual Regret Minimization (CFR) is a popular iterative algorithm for approximating Nash equilibria in imperfect-information multi-step two-player zero-sum games. We introduce the first general, principled method for warm starting CFR. Our approach requires only a strategy for each player, and accomplishes the warm start at the cost of a single traversal of the game tree. The method pro...

Monte Carlo Sampling for Regret Minimization in Extensive Games

Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome samplin...

Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions

Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing strategies in extensive-form games. The Monte Carlo CFR (MCCFR) variants reduce the per iteration time cost of CFR by traversing a smaller, sampled portion of the tree. The previous most effective instances of MCCFR can still be very slow in games with many player actions since they sample every action for ...

Publication date: 2011
